Add custom provider protocol support for OpenAI-compatible and Gemini-native APIs #20

Open
lartpang wants to merge 9 commits into ResearAI:main from lartpang:main

@lartpang lartpang commented Apr 26, 2026

@HughYau @TuchuanLin @ResearAI

This PR adds protocol-aware API configuration so AutoFigure can work with third-party/custom model providers, not only the built-in OpenRouter, Bianxie, and Google Gemini presets.

It explicitly supports both OpenAI-compatible chat/completions APIs and Gemini Native generateContent APIs across the web workflow.

Changes

API protocol support

  • Add shared API protocol helpers for OpenAI-compatible and Gemini Native endpoints.
  • Add Gemini Native text generation support for layout generation and prompt conversion paths.
  • Normalize provider base URLs for common endpoint forms:
    • /v1
    • /chat/completions
    • /openai
    • /gemini/v1beta
    • /gemini/v1beta/models
  • Improve direct request URL construction for OpenAI-compatible image-generation endpoints.
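As a rough illustration of the base-URL normalization described above (a hedged sketch; the function and constant names are hypothetical, not the PR's actual code), accepting the common endpoint forms could look like:

```python
# Hypothetical sketch of base-URL normalization; names are illustrative,
# not the actual AutoFigure implementation.

# Suffixes users commonly paste but that the client appends itself.
_REDUNDANT_SUFFIXES = (
    "/chat/completions",
    "/models",
)

def normalize_base_url(base_url: str) -> str:
    """Strip trailing slashes and redundant endpoint suffixes so that
    /v1, /openai, and /gemini/v1beta style base URLs are all accepted."""
    url = base_url.strip().rstrip("/")
    for suffix in _REDUNDANT_SUFFIXES:
        if url.endswith(suffix):
            url = url[: -len(suffix)]
            break
    return url.rstrip("/")
```

With this shape, `.../gemini/v1beta/models` collapses to `.../gemini/v1beta`, matching the "version-level base URL" recommendation in the Example section below.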

Web UI provider configuration

  • Add a Custom provider option.
  • Add protocol selectors for:
    • Layout Generation LLM
    • Methodology Extraction LLM
    • Beautification Code2Prompt LLM
    • Image Generation API
  • Pass protocol configuration from the frontend to the backend.
  • Surface backend error messages in the UI instead of showing only generic HTTP status text.
  • Add clearer hover and ARIA descriptions for iteration control buttons.
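Conceptually, the protocol selection passed from the frontend to the backend might drive endpoint construction along these lines (a sketch under assumptions; the dataclass, field names, and protocol strings are illustrative, not the PR's actual code):

```python
from dataclasses import dataclass

# Illustrative sketch of protocol-aware endpoint construction;
# names and shapes are assumptions, not the PR's actual code.

@dataclass
class ProviderConfig:
    protocol: str   # "openai" (OpenAI-compatible) or "gemini" (Gemini Native)
    base_url: str   # already normalized, e.g. ".../v1" or ".../gemini/v1beta"
    model: str

def text_generation_url(cfg: ProviderConfig) -> str:
    """Build the request URL for the configured protocol."""
    if cfg.protocol == "openai":
        # OpenAI-compatible chat endpoint; the model goes in the JSON body.
        return f"{cfg.base_url}/chat/completions"
    if cfg.protocol == "gemini":
        # Gemini Native puts the model in the URL path.
        return f"{cfg.base_url}/models/{cfg.model}:generateContent"
    raise ValueError(f"unknown protocol: {cfg.protocol}")
```

Keeping protocol explicit in the config, rather than inferring it from the base URL, is what removes the ambiguity for custom providers.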

Documentation

  • Document minimal Windows web startup commands without requiring helper scripts.
  • Document optional PDF_API_URL support for remote PDF-to-Markdown extraction.
  • Document the Web UI model mapping for text/SVG models and image generation models.
  • Add third-party API compatibility guidance for:
    • Custom + OpenAI Compatible
    • Custom + Gemini Native
  • Clarify the recommended Gemini Native base URL format.

Windows/frontend compatibility

  • Avoid standalone output on native Windows because traced chunk filenames may contain :, which is invalid on Windows filesystems.
  • Synchronize the lockfile root package name with package.json.
  • Preserve the original Google font loading behavior to minimize visual changes from upstream.

Why

Previously, third-party Gemini-style APIs were only partially supported.

OpenAI-compatible providers could often be used by reusing an existing preset, but Gemini Native generateContent endpoints were not consistently supported across layout generation, methodology extraction, Code2Prompt, and image generation.

The Web UI also did not let users explicitly choose the API protocol, which made custom provider setup ambiguous. This PR makes the provider/protocol relationship explicit and supports third-party Gemini Native endpoints more reliably.

Example

If a provider documents an endpoint like:

https://provider.example.com/api/v1/gemini/v1beta/models

configure the Web UI as:

Provider: Custom
Protocol: Gemini Native
Base URL: https://provider.example.com/api/v1/gemini/v1beta
Model: gemini-3.1-pro-preview

Endpoints ending in /models are normalized automatically, but the version-level base URL is the recommended input.

Notes

Gemini Native APIs commonly put the API key in the query string as ?key=.... The explicit debug log masks this value, but a future hardening improvement could sanitize exception messages from lower-level HTTP errors as well.
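One possible shape for that future sanitization (a hypothetical helper, not part of this PR) is a small regex pass that masks `key=` query values in error text before it is logged or re-raised:

```python
import re

# Hypothetical sketch: mask Gemini-style ?key=... values in error text
# before logging. Not part of this PR; illustrative only.
_KEY_RE = re.compile(r"([?&]key=)[^&\s]+")

def mask_api_key(text: str) -> str:
    """Replace the value of any key= query parameter with ***."""
    return _KEY_RE.sub(r"\1***", text)
```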
